531 research outputs found

    Encoding of low-quality DNA profiles as genotype probability matrices for improved profile comparisons, relatedness evaluation and database searches

    Many DNA profiles recovered from crime scene samples are of a quality that does not allow them to be searched against, nor entered into, databases. We propose a method for the comparison of profiles arising from two DNA samples, one or both of which can have multiple donors and be affected by low DNA template or degraded DNA. We compute likelihood ratios to evaluate the hypothesis that the two samples have a common DNA donor, and hypotheses specifying the relatedness of two donors. Our method uses a probability distribution for the genotype of the donor of interest in each sample. This distribution can be obtained from a statistical model, or we can exploit the ability of trained human experts to assess genotype probabilities, thus extracting much information that would be discarded by standard interpretation rules. Our method is compatible with established methods in simple settings, but is more widely applicable and can make better use of information than many current methods for the analysis of mixed-source, low-template DNA profiles. It can accommodate uncertainty arising from relatedness instead of or in addition to uncertainty arising from noisy genotyping. We describe a computer program GPMDNA, available under an open source license, to calculate LRs using the method presented in this paper. Comment: 28 pages. Accepted for publication 2-Sep-2016 - Forensic Science International: Genetics

    Bayesian models for syndrome- and gene-specific probabilities of novel variant pathogenicity

    BACKGROUND: With the advent of affordable and comprehensive sequencing technologies, access to molecular genetics for clinical diagnostics and research applications is increasing. However, variant interpretation remains challenging, and tools that close the gap between data generation and data interpretation are urgently required. Here we present a transferable approach to help address the limitations in variant annotation. METHODS: We develop a network of Bayesian logistic regression models that integrate multiple lines of evidence to evaluate the probability that a rare variant is the cause of an individual's disease. We present models for genes causing inherited cardiac conditions, though the framework is transferable to other genes and syndromes. RESULTS: Our models report a probability of pathogenicity, rather than a categorisation into pathogenic or benign, which captures the inherent uncertainty of the prediction. We find that gene- and syndrome-specific models outperform genome-wide approaches, and that the integration of multiple lines of evidence performs better than individual predictors. The models are adaptable to incorporate new lines of evidence, and results can be combined with familial segregation data in a transparent and quantitative manner to further enhance predictions. Though the probability scale is continuous, and innately interpretable, performance summaries based on thresholds are useful for comparisons. Using a threshold probability of pathogenicity of 0.9, we obtain a positive predictive value of 0.999 and sensitivity of 0.76 for the classification of variants known to cause long QT syndrome over the three most important genes, which represents sufficient accuracy to inform clinical decision-making. A web tool APPRAISE [http://www.cardiodb.org/APPRAISE] provides access to these models and predictions. 
CONCLUSIONS: Our Bayesian framework provides a transparent, flexible and robust framework for the analysis and interpretation of rare genetic variants. Models tailored to specific genes outperform genome-wide approaches, and can be sufficiently accurate to inform clinical decision-making.
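The threshold-based summaries quoted in the RESULTS (positive predictive value and sensitivity at a probability cut-off of 0.9) can be illustrated with a minimal sketch. The function name, the use of `>=` at the threshold, and the toy inputs are assumptions made for illustration; this is not the APPRAISE implementation.

```python
from typing import Sequence, Tuple


def ppv_and_sensitivity(
    probs: Sequence[float],
    is_pathogenic: Sequence[bool],
    threshold: float = 0.9,
) -> Tuple[float, float]:
    """Summarise predicted probabilities of pathogenicity at one threshold.

    A variant is "called" pathogenic when its probability is >= threshold
    (the inclusive comparison is an assumption of this sketch).
    """
    tp = fp = fn = 0
    for p, truth in zip(probs, is_pathogenic):
        called = p >= threshold
        if called and truth:
            tp += 1          # correctly called pathogenic
        elif called and not truth:
            fp += 1          # benign variant called pathogenic
        elif truth:
            fn += 1          # pathogenic variant missed
    # PPV: fraction of calls that are truly pathogenic.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    # Sensitivity: fraction of truly pathogenic variants that are called.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity
```

Raising the threshold typically trades sensitivity for positive predictive value, which is the pattern in the figures reported above (0.999 and 0.76 at a threshold of 0.9).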

    Verifying likelihoods for low template DNA profiles using multiple replicates

    To date there is no generally accepted method to test the validity of algorithms used to compute likelihood ratios (LR) evaluating forensic DNA profiles from low-template and/or degraded samples. An upper bound on the LR is provided by the inverse of the match probability, which is the usual measure of weight of evidence for standard DNA profiles not subject to the stochastic effects that are the hallmark of low-template profiles. However, even for low-template profiles the LR in favour of a true prosecution hypothesis should approach this bound as the number of profiling replicates increases, provided that the queried contributor is the major contributor. Moreover, for sufficiently many replicates the standard LR for mixtures is often surpassed by the low-template LR. It follows that multiple LTDNA replicates can provide stronger evidence for a contributor to a mixture than a standard analysis of a good-quality profile. Here, we examine the performance of the likeLTD software for up to eight replicate profiling runs. We consider simulated and laboratory-generated replicates as well as resampling replicates from a real crime case. We show that LRs generated by likeLTD usually do exceed the mixture LR given sufficient replicates, are bounded above by the inverse match probability and do approach this bound closely when this is expected. We also show good performance of likeLTD even when a large majority of alleles are designated as uncertain, and suggest that there can be advantages to using different profiling sensitivities for different replicates. Overall, our results support both the validity of the underlying mathematical model and its correct implementation in the likeLTD software.
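The upper bound mentioned in this abstract, the inverse of the match probability, can be sketched for a single-donor profile. The sketch assumes Hardy-Weinberg proportions, independent loci, an unrelated alternative donor and no population-structure correction; the function names, locus names and allele frequencies are hypothetical, and none of this is taken from likeLTD.

```python
from typing import Dict, Tuple

Genotype = Tuple[str, str]


def match_probability(
    profile: Dict[str, Genotype],
    allele_freqs: Dict[str, Dict[str, float]],
) -> float:
    """Probability that an unrelated person has the same multi-locus genotype.

    Assumes Hardy-Weinberg proportions at each locus and independence
    between loci (simplifying assumptions of this sketch).
    """
    mp = 1.0
    for locus, (a, b) in profile.items():
        pa = allele_freqs[locus][a]
        pb = allele_freqs[locus][b]
        # Homozygote: p^2; heterozygote: 2pq.
        mp *= pa * pa if a == b else 2.0 * pa * pb
    return mp


def lr_upper_bound(
    profile: Dict[str, Genotype],
    allele_freqs: Dict[str, Dict[str, float]],
) -> float:
    """Inverse match probability: the bound a low-template LR should not exceed."""
    return 1.0 / match_probability(profile, allele_freqs)
```

With a heterozygote at allele frequencies 0.1 and 0.2 and a homozygote at frequency 0.5, the match probability is 2 × 0.1 × 0.2 × 0.5² = 0.01, so the bound on the LR is about 100.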

    Diffusional Relaxation in Random Sequential Deposition

    The effect of diffusional relaxation on the random sequential deposition process is studied in the limit of fast deposition. Expressions for the coverage as a function of time are analytically derived for both the short-time and long-time regimes. These results are tested and compared with numerical simulations. Comment: 9 pages + 2 figures

    A Genome-Wide Association Study of the Metabolic Syndrome in Indian Asian Men

    We conducted a two-stage genome-wide association study to identify common genetic variation altering risk of the metabolic syndrome and related phenotypes in Indian Asian men, who have a high prevalence of these conditions. In Stage 1, approximately 317,000 single nucleotide polymorphisms were genotyped in 2700 individuals, from which 1500 SNPs were selected to be genotyped in a further 2300 individuals. Selection for inclusion in Stage 1 was based on four metabolic syndrome component traits: HDL-cholesterol, plasma glucose and Type 2 diabetes, abdominal obesity measured by waist to hip ratio, and diastolic blood pressure. Association was tested with these four traits and a composite metabolic syndrome phenotype. Four SNPs reaching significance level p0.8 were found in genes CETP and LPL, associated with HDL-cholesterol. These associations have already been reported in Indian Asians and in Europeans. Five additional loci harboured SNPs significant at p0.5 for HDL-cholesterol, type 2 diabetes or diastolic blood pressure. Our results suggest that the primary genetic determinants of metabolic syndrome are the same in Indian Asians as in other populations, despite the higher prevalence. Further, we found little evidence of a common genetic basis for metabolic syndrome traits in our sample of Indian Asian men

    MultiBLUP: improved SNP-based prediction for complex traits.

    BLUP (best linear unbiased prediction) is widely used to predict complex traits in plant and animal breeding, and increasingly in human genetics. The BLUP mathematical model, which consists of a single random effect term, was adequate when kinships were measured from pedigrees. However, when genome-wide SNPs are used to measure kinships, the BLUP model implicitly assumes that all SNPs have the same effect-size distribution, which is a severe and unnecessary limitation. We propose MultiBLUP, which extends the BLUP model to include multiple random effects, allowing greatly improved prediction when the random effects correspond to classes of SNPs with distinct effect-size variances. The SNP classes can be specified in advance, for example based on SNP functional annotations, and we also provide an adaptive procedure for determining a suitable partition of SNPs. We apply MultiBLUP to genome-wide association data from the Wellcome Trust Case Control Consortium (seven diseases), and from much larger studies of Celiac Disease and Inflammatory Bowel Disease, finding that it consistently provides better prediction than alternative methods. Moreover, MultiBLUP is computationally very efficient; for the largest dataset, which includes 12,678 individuals and 1.5M SNPs, the total analysis can be run on a single desktop PC in under a day, and can be parallelized to run even faster. Tools to perform MultiBLUP are freely available in our software LDAK
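The contrast between the two models described in this abstract can be written out as a short sketch in standard mixed-model notation. The symbols below ($\mathbf{y}$, $\mu$, $\mathbf{K}$, $\sigma^2$, $M$) are assumptions chosen for illustration and do not appear in the abstract.

```latex
% BLUP: one random effect, with a single kinship matrix K computed from all SNPs
\mathbf{y} = \mu\mathbf{1} + \mathbf{g} + \mathbf{e}, \qquad
\mathbf{g} \sim \mathcal{N}(\mathbf{0}, \sigma_g^2 \mathbf{K}), \qquad
\mathbf{e} \sim \mathcal{N}(\mathbf{0}, \sigma_e^2 \mathbf{I})

% MultiBLUP: M random effects, one per SNP class, each with its own
% kinship matrix K_m and its own variance component sigma_m^2
\mathbf{y} = \mu\mathbf{1} + \sum_{m=1}^{M} \mathbf{g}_m + \mathbf{e}, \qquad
\mathbf{g}_m \sim \mathcal{N}(\mathbf{0}, \sigma_m^2 \mathbf{K}_m), \qquad
\mathbf{e} \sim \mathcal{N}(\mathbf{0}, \sigma_e^2 \mathbf{I})
```

Setting $M = 1$ recovers the single-effect model, which is one way to read the abstract's point that standard BLUP forces every SNP to share one effect-size distribution.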

    The Rise and Fall of BritainsDNA: A Tale of Misleading Claims, Media Manipulation and Threats to Academic Freedom

    Direct-to-consumer genetic ancestry testing is a new and growing industry that has gained widespread media coverage and public interest. Its scientific base is in the fields of population and evolutionary genetics and it has benefitted considerably from recent advances in rapid and cost-effective DNA typing technologies. There is a considerable body of scientific literature on the use of genetic data to make inferences about human population history, although publications on inferring the ancestry of specific individuals are rarer. Population geneticists have questioned the scientific validity of some population history inference approaches, particularly those of a more interpretative nature. These controversies have spilled over into commercial genetic ancestry testing, with some companies making sensational claims about their products. One such company—BritainsDNA—made a number of dubious claims both directly to its customers and in the media. Here we outline our scientific concerns, document the exchanges between us, BritainsDNA and the BBC, and discuss the issues raised about media promotion of commercial enterprises, academic freedom of expression, science and pseudoscience and the genetic ancestry testing industry. We provide a detailed account of this case as a resource for historians and sociologists of science, and to shape public understanding, media reporting and scientific scrutiny of the commercial use of population and evolutionary genetics

    Model of Cluster Growth and Phase Separation: Exact Results in One Dimension

    We present exact results for a lattice model of cluster growth in 1D. The growth mechanism involves interface hopping and pairwise annihilation supplemented by spontaneous creation of the stable-phase, +1, regions by overturning the unstable-phase, -1, spins with probability p. For cluster coarsening at phase coexistence, p=0, the conventional structure-factor scaling applies. In this limit our model falls in the class of diffusion-limited reactions A+A->inert. The +1 cluster size grows diffusively, ~t**(1/2), and the two-point correlation function obeys scaling. However, for p>0, i.e., for the dynamics of formation of stable phase from unstable phase, we find that structure-factor scaling breaks down; the length scale associated with the size of the growing +1 clusters reflects only the short-distance properties of the two-point correlations. Comment: 12 pages